Plasma Virome Reveals Blooms and Transmission of Anellovirus in Intravenous Drug Users with HIV-1, HCV, and/or HBV Infections

ABSTRACT Intravenous drug users (IDUs) are a high-risk group for HIV-1, hepatitis C virus (HCV), and hepatitis B virus (HBV) infections, which are the leading causes of death in IDUs. However, the plasma virome of IDUs and how it is influenced by these viral infections remain unclear. Using viral metagenomics, we determined the plasma virome of IDUs and its association with HIV-1, HCV, and/or HBV infections. Compared with healthy individuals, IDUs, especially those with major viral infections, had higher viral abundance and diversity. Anelloviridae dominated the plasma virome. Coinfections with multiple anelloviruses were common, and anelloviruses from the same genus tended to coexist. In this study, 4,487 anellovirus ORF1 sequences were identified, including 1,620 (36.1%) with less than 69% identity to any known sequence, which tripled the current number. Compared with healthy controls (HC), more anellovirus sequences were observed in neg-IDUs, and HIV-1, HCV, and/or HBV infections further expanded the sequence number in IDUs, which was characterized by the emergence of novel divergent taxa and blooms of resident anelloviruses. Pegivirus was mainly identified in infected IDUs. Five main pegivirus transmission clusters (TCs) were identified by phylogenetic analysis, suggesting a transmission link. Similar anellovirus profiles were observed in IDUs within the same TC, suggesting transmission of the anellome among IDUs. Our data suggest that IDUs carry a higher plasma viral burden, especially of anelloviruses, which was associated with HIV-1, HCV, and/or HBV infections. Blooms in abundance and the unprecedented diversity of anellovirus highlight the active evolution and replication of this virus in the blood circulation and point to an as yet uncharacterized role in its interaction with the host.
IMPORTANCE The virome is associated with immune status and determines or influences disease progression through both pathogenic and resident viruses. The increased viral burden in IDUs, especially those with major viral infections, indicates the suboptimal immune status and high infection risk of this population. Blooms in abundance and the unprecedented diversity of anellovirus highlight its active evolution and replication in the blood circulation and its sensitive response to other viral infections. In addition, transmission cluster analysis revealed transmission links of pegivirus among IDUs, and individuals with transmission links shared similar anellome profiles. In-depth monitoring of the plasma virome in high-risk populations is needed not only for surveillance of emerging viruses and of transmission networks of major and neglected bloodborne viruses, but also for a better understanding of commensal viruses and the roles they may play in interacting with the immune system.

that HIV infection would greatly increase the abundance and diversity of anellovirus. We added a few points about the relationship with our previous study, and also mentioned the limitations of the methods used. Line: 428-437.
Thanks for the suggestion. We also checked the results for Circoviridae: a full viral genome (GenBank: ON226770) most closely related to porcine circovirus 3 (<71% identity) was found in one IDU sample. As a high-risk population, IDUs may carry emerging or neglected viruses. It would be interesting to screen more IDUs and/or other high-risk individuals for its prevalence, and to investigate the possible origin and transmission of this virus. A few points were added in the discussion. Line 423-427.

On the interpretation of the data, the authors should not assume causality. You have no evidence that the HBV, HCV, or HIV infections caused anything and you have absolutely no data on immune responses. It seems a PWID who already acquired one or more virus infection in blood not surprisingly will have others. Did those PWID get HIV, HCV, or HBV because they were more immunosuppressed than the PWID who didn't? It is widely accepted that they acquire those infections because they are more exposed. More than likely, it will be the same story for anellovirus but, until proven, it may be wise to just observe what you found and not speculate (with no evidence) on how.
Response: Thanks for the suggestion. In the revised MS, we removed several passages from the discussion to avoid unsupported speculation.

How did the authors deal with alignment of the short reads produced by the NovaSeq? Given the lack of clear reference sequences, how did you train Megahit to correctly align your sequences? Could some of the apparent diversity be due to problems with alignment? How do you perform the phylogenetic analysis when you aren't confident you are comparing viruses with a common ancestry? For example, one wouldn't include dengue and HCV together to make transmission inferences. How did you come up with your thresholds for clusters?
Response: Megahit was described in 2015 (Bioinformatics. 2015 May 15;31(10):1674-6) and is among the most widely used programs in metagenomics. Many studies have found it to perform well (e.g., BMC Genomics. 2021 Nov 24;22(1):849; Genes Genomics. 2019 Sep;41(9):1077-1083; Brief Bioinform. 2020 May 21;21(3):777-790; J Microbiol Methods. 2018 Aug;151:99-105). To annotate NGS short reads accurately, we used both assembled contigs and reads for virus annotation (many studies used only short reads for the analysis, e.g., the SURPI method; Legoff et al. Nat Med 2017; Gu et al. Nat Med 2021), and virus candidates were further checked for potential false positives. We believe our methods are both stringent and sensitive enough. The method was also described in previous studies by us and others (Li et al. J Virol 2021; Li et al. Viruses 2020; Li et al. AIDS 2020; Liu et al. mSphere 2021; Siqueira et al. Nat Commun 2018; Zhao et al. Virology 2017).
The phylogenetic analysis was performed for anellovirus and pegivirus separately, and only related virus sequences (either anellovirus or pegivirus) were included in each phylogenetic tree. To assess possible transmission of anellovirus and pegivirus, we first analyzed the phylogenetic relationships of all pegiviruses and found that many sequences clustered closely (similar to other bloodborne viruses, e.g., HIV and HCV). We then identified transmission clusters with a stringent threshold of 1% genetic distance (a distance of 1.5-4.5% is generally used for HIV-1 and HCV; J Infect Dis. 2014 Jan 15;209(2):304-13; Sci Rep. 2016 Oct 4;6:34729; Hepatology. 2021 Oct;74(4):1782-94). Pegivirus and HCV are related to each other on the phylogenetic tree and share similar genomic structures, so the threshold of 1% distance is stringent and accurate for cluster picking.
After identifying transmission clusters based on the pegivirus phylogeny, we compared the diversity and genetic distance of anellovirus among individuals within each cluster as well as between clusters. Thus, the transmission of anellovirus was not determined by direct comparison between pegivirus and anellovirus; instead, it was determined by analyzing the relatedness of the anelloviruses between linked individuals (also see recent studies: Abbas et al. Am J Transplant. 2019 April; Kandathil AJ, et al. Nat Commun. 2021). We added more description to the methods of the revised ms to clarify this. Line: 164-170.
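The idea of picking clusters at a 1% genetic distance can be illustrated with a minimal sketch. This is not the pipeline used in the study, which worked from the pegivirus phylogeny; as a simplifying assumption it links samples directly by pairwise p-distance on an alignment and groups them by single linkage. The function names (`p_distance`, `distance_clusters`, `mutate`) and the toy sequences are hypothetical.

```python
from itertools import combinations

SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Toy helper: return seq with a substitution at each listed position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or N in either sequence are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_clusters(seqs, threshold=0.01):
    """Single-linkage clusters: two samples are linked if distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Only groups with at least two members count as clusters.
    return [g for g in groups.values() if len(g) >= 2]


# Toy 200-nt alignment (assumed data, not real pegivirus sequences).
base = "ACGT" * 50
far = mutate(base, range(0, 100, 10))
toy = {
    "idu_1": base,
    "idu_2": mutate(base, [5]),               # 0.5% from idu_1
    "idu_3": far,                             # 5% from idu_1
    "idu_4": mutate(far, [150]),              # 0.5% from idu_3
    "idu_5": mutate(base, range(1, 200, 10)), # 10% from idu_1
}
clusters = distance_clusters(toy, threshold=0.01)
print(sorted(sorted(c) for c in clusters))
```

With these toy inputs, two pairs fall under the threshold and the fifth sample stays unclustered. Anellovirus profiles could then be compared within and between such groups, as the response describes.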

You appear to have detected reads for the already recognized viruses. To what extent did the abundance in the HBV group relate to simply HBV DNA reads vs some intrinsic difference in the group? You might tell us exactly what you detected to give us a sense of your overall sensitivity. For example, did you correlate qPCR for HBV, HCV, and HIV with your metagenomic read number?
Response: Metagenomic sequencing is widely used in research and increasingly in clinical practice. Many studies have evaluated NGS as a powerful tool compared with traditional methods. We compared detection (positive or negative) of these viruses by both methods, and NGS was largely consistent with the qPCR results: 24 (NGS)/26 (PCR) for the HIV+ group, 11/11 for the HCV+ group, 28/29 for the HIV+ and HCV+ group, and 9/10 for the HBV+ group. These data indicate that we had good sensitivity.
We also agree that it is helpful to correlate the viral reads with qPCR viral loads to validate the NGS results. Unfortunately, the initial PCR screening was not done by a quantitative method, so we used the remaining samples (several samples had run out) to measure TTV, as well as HIV, HCV, and HBV, again with qPCR. We found a significant correlation between TTV Ct value and reads. However, among the other three viruses, a significant correlation was observed only for HBV. The WTA kit we used to enrich viral nucleic acids (common in virome studies) may preferentially amplify circular TTV genomes and disturb the correlations for HIV and HCV. We added a few points on the limitations of the methods used. Line: 423-427.
A new Table S1 and Figure S1 comparing NGS and qPCR were provided in the revised ms.
New Figure S1. Correlations between qPCR and NGS reads.

The paper needs some grammatical assistance in English, which is understandable (and available).
Response: Thanks for the suggestion. We carefully checked and revised the language throughout the manuscript.

Li et al. examine the human virome of intravenous drug users using metagenomic sequencing and describe how the virome differs between healthy individuals and patients with chronic viral infections (HIV, HBV, HCV). The authors report higher viral burden and diversity in IDUs and further viral abundance in IDUs infected with HIV, HCV, and HBV. They describe blooms of anelloviruses in IDUs compared to controls, confirming the interesting and underexplored role of these ubiquitous viruses in humans. Some questions/suggestions that I think will strengthen this manuscript.
1. Many of the results describe the viral composition aggregated across all study participants or groups of study participants. Please present the presence of virus reads broken down by individual.
Response: Thanks for the suggestion. We provided a new Figure S2 (related to Figure 1b&c) that displays the relative abundance of virus reads for each individual.
New Figure S2. Relative abundance of main vertebrate viruses (a) and prokaryotic viruses (bacteriophages)(b) by individual.

This study examined IDUs and IDUs positive for HIV, HBV, and HCV. What viral sequencing results (RPM, assemblies) can you report from HIV, HBV, and HCV in the study participants?
Response: We compared detection (positive or negative) of these viruses by both methods, and NGS was largely consistent with the qPCR results: 24 (NGS)/26 (PCR) for the HIV+ group, 11/11 for the HCV+ group, 28/29 for the HIV+ and HCV+ group, and 9/10 for the HBV+ group. These data indicate that NGS had good sensitivity (94.7%). In addition, we performed correlation analyses between qPCR and NGS reads for Anelloviridae, HIV-1, HCV, and HBV. We found a significant correlation between TTV Ct value and NGS reads. However, among the other three viruses, a significant correlation was observed only for HBV. The WTA kit we used to enrich viral nucleic acids (common in virome studies) may preferentially amplify circular TTV genomes and disturb the correlations for HIV and HCV. We added a few points on the limitations of the methods used. Line: 423-427. (Please also see the response to reviewer 1.) A new Table S1 and Figure S1 comparing NGS and PCR were provided in the revised MS.
New Table S1. NGS detection in PCR-positive groups: HIV-1&HCV (n=29), 28 (96.5%); HBV (n=10), 9 (90%).
New Figure S1. Correlations between qPCR and NGS reads.
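The pooled sensitivity quoted in this response can be checked from the per-group counts. The short sketch below does only that arithmetic; the dictionary layout and variable names are illustrative, and PCR positivity is treated as the reference standard, as the response implies.

```python
# (NGS-positive, PCR-positive) sample counts per group, as quoted in the response.
counts = {
    "HIV-1": (24, 26),
    "HCV": (11, 11),
    "HIV-1 & HCV": (28, 29),
    "HBV": (9, 10),
}


def sensitivity(ngs_positive, pcr_positive):
    """Fraction of PCR-positive samples that NGS also detected."""
    return ngs_positive / pcr_positive


per_group = {group: sensitivity(*c) for group, c in counts.items()}
total_ngs = sum(ngs for ngs, _ in counts.values())
total_pcr = sum(pcr for _, pcr in counts.values())
overall = sensitivity(total_ngs, total_pcr)

for group, value in per_group.items():
    print(f"{group}: {value:.1%}")
print(f"overall: {total_ngs}/{total_pcr} = {overall:.1%}")
```

Pooling the four groups gives 72 of 76 samples, which matches the 94.7% stated above.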

Please describe any steps taken in your sequencing analysis pipeline to remove false-positive reads derived from contaminants such as plasmids and vectors.
Response: Human- and bacterium-depleted sequences were first searched against viral nucleic acid and protein databases. All viral hit candidates were then searched against the NCBI nonredundant nt and nr databases to remove reads or contigs (false positives) that were more similar to host, bacterial, fungal, plasmid, vector, or other non-viral sequences than to viral sequences. We revised the description of this point in the methods, line 138-141.
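The second filtering step can be sketched as a best-hit comparison. This is a simplified illustration and not the authors' code: it assumes the nt/nr search results have already been parsed into records carrying a bit score and a coarse subject category, and it assumes that ties between a viral and a non-viral hit are resolved conservatively by discarding the candidate. The `Hit` class, its fields, and the category labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str         # read or contig identifier
    subject_kind: str  # e.g. "virus", "human", "bacteria", "fungi", "plasmid", "vector"
    bitscore: float    # similarity score of the alignment


def keep_viral_candidates(hits):
    """Return the query IDs whose best-scoring hit is viral."""
    best = {}
    for h in hits:
        cur = best.get(h.query)
        # Assumption: on a tie, the non-viral hit wins, so ambiguous
        # candidates are treated as false positives.
        if cur is None or h.bitscore > cur.bitscore or (
            h.bitscore == cur.bitscore and h.subject_kind != "virus"
        ):
            best[h.query] = h
    return {q for q, h in best.items() if h.subject_kind == "virus"}


# Toy search results (made-up identifiers and scores).
hits = [
    Hit("contig_1", "virus", 950.0),
    Hit("contig_1", "bacteria", 300.0),
    Hit("read_7", "virus", 120.0),
    Hit("read_7", "vector", 180.0),
    Hit("read_9", "virus", 200.0),
    Hit("read_9", "plasmid", 200.0),
    Hit("contig_2", "virus", 500.0),
]
kept = keep_viral_candidates(hits)
print(sorted(kept))
```

Here `read_7` is dropped because a vector hit scores higher than its viral hit, and `read_9` is dropped under the tie rule.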

Longitudinal samples were unavailable for this study, and patient age may influence virome composition and diversity. Were any changes in the virome make-up observed by duration of drug use or patient age?
Response: Thanks for the important suggestion. We analyzed the associations of viral abundance and diversity with age and duration of drug use. We found that age did not influence the virome composition; however, a longer duration of drug use was associated with higher viral reads and richness. The data indicate that longer drug use may lead to a higher risk of viral infections. A new Figure S3 was provided in the revised ms.
Figure S3. Influence of age (top) and duration of drug use (bottom) on the blood viral composition. Spearman's correlation was analyzed between age/duration of drug use and viral reads (RPM), richness, and Shannon index.
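The quantities named in the Figure S3 caption can be computed as in the following sketch. It is a plain-Python illustration under stated assumptions, not the analysis code of the study: the function names and toy counts are invented, the Shannon index uses the natural logarithm, and only the correlation coefficient is computed (no P value).

```python
import math


def rpm(virus_reads, total_reads):
    """Viral reads per million total reads."""
    return virus_reads / total_reads * 1_000_000


def richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index (natural log) from per-taxon read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy data: years of drug use and per-taxon read counts for five samples.
years = [1, 3, 5, 8, 12]
samples = [[10, 0, 0], [5, 5, 0], [8, 1, 1], [4, 4, 4, 2], [3, 3, 3, 3, 3]]
rho_richness = spearman(years, [richness(s) for s in samples])
rho_shannon = spearman(years, [shannon(s) for s in samples])
print(rho_richness, rho_shannon)
```

In practice a statistics library would also supply the P value for each correlation; the hand-written version is used here only to keep the sketch self-contained.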

Previous blood virome studies report Herpesviridae as a common component, yet they appear largely undetected in this study (Fig1). How do the authors reconcile those findings with prior studies?
Response: Herpesviridae may be prevalent in the general population, but most infections are asymptomatic or latent and can be reactivated under certain disease conditions or when the immune system is impaired. Transplant recipients are usually immunosuppressed, and virome studies of these individuals have reported higher detection rates of Herpesviridae. We searched other blood virome studies of different cohorts, and most also found low proportions or an absence of Herpesviridae. For example: Exploring the Diversity of the Human Blood Virome. Viruses. 2021 Nov;13(11)